
Add a stop-the-world, serial Compressor #1340


Merged: 24 commits, Jul 24, 2025

Conversation

no-defun-allowed
Collaborator

This PR adds a Compressor-esque garbage collector, which is based on MarkCompact but uses a mark bitmap for forwarding metadata like the Compressor.
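As a rough illustration of the idea (a hypothetical, simplified sketch, not the code in this PR): the Compressor marks the first and last word of each live object in a side bitmap, records per-block live-byte totals in an offset vector, and derives a forwarding address by adding the live bytes preceding an object to the space start. All names and sizes below are toy assumptions.

```rust
const WORD: usize = 8; // bytes per word (assumption for this toy)
const BLOCK_WORDS: usize = 64; // words per block (toy size)

/// Count live bytes covered by mark-bit pairs in `bits[0..limit]`.
/// Bits come in (first-word, last-word) pairs; a pair at word indices
/// (f, l) covers (l - f + 1) * WORD bytes. Single-word objects, which
/// would set only one bit, are ignored in this toy.
fn live_bytes_before(bits: &[bool], limit: usize) -> usize {
    let mut live = 0;
    let mut first: Option<usize> = None;
    for (i, &b) in bits.iter().enumerate().take(limit) {
        if b {
            match first.take() {
                None => first = Some(i),
                Some(f) => live += (i - f + 1) * WORD,
            }
        }
    }
    live
}

/// Forwarding address = space start + live bytes before the block
/// (from the offset vector) + live bytes before the object within
/// its block (recomputed from the mark bitmap).
fn forward(space_start: usize, offset_vector: &[usize], bits: &[bool], addr_word: usize) -> usize {
    let block = addr_word / BLOCK_WORDS;
    let in_block = addr_word % BLOCK_WORDS;
    let block_bits = &bits[block * BLOCK_WORDS..(block + 1) * BLOCK_WORDS];
    space_start + offset_vector[block] + live_bytes_before(block_bits, in_block)
}

fn main() {
    // Object A occupies words 0..=4 (40 bytes); object B starts at word 6.
    let mut bits = vec![false; BLOCK_WORDS];
    bits[0] = true;
    bits[4] = true;
    bits[6] = true;
    bits[7] = true;
    assert_eq!(forward(0x1000, &[0], &bits, 6), 0x1000 + 40);
}
```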

@k-sareen k-sareen self-requested a review July 15, 2025 05:37
@no-defun-allowed
Collaborator Author

The current failure to compile mock_test_allocator_info is a one-line fix, but I don't think the mock VM tests are going to work, as the mock VM uses header metadata and the Compressor requires mark bits to be in a side bitmap.

@k-sareen
Collaborator

Thanks for the PR @no-defun-allowed! The major thing I see is that currently the Compressor plan is not using the LOS. Was this intentional or accidental?

@no-defun-allowed
Collaborator Author

Thanks for the PR @no-defun-allowed! The major thing I see is that currently the Compressor plan is not using the LOS. Was this intentional or accidental?

Definitely accidental. Thanks for the review.

@qinsoon
Member

qinsoon commented Jul 16, 2025

The current failure to compile mock_test_allocator_info is a one-line fix, but I don't think the mock VM tests are going to work, as the mock VM uses header metadata and the Compressor requires mark bits to be in a side bitmap.

You can cherry pick this commit to the PR: qinsoon@ac90d4b.

It runs mock VM with side metadata. Compressor currently fails for tests that are not using contiguous space.

@no-defun-allowed
Collaborator Author

You can cherry pick this commit to the PR: qinsoon@ac90d4b.

Thanks!

It runs mock VM with side metadata. Compressor currently fails for tests that are not using contiguous space.

Right. Should we skip such tests when running with the Compressor?

@qinsoon
Member

qinsoon commented Jul 16, 2025

It runs mock VM with side metadata. Compressor currently fails for tests that are not using contiguous space.

Right. Should we skip such tests when running with the Compressor?

Is there any intrinsic reason why the compressor cannot run with discontiguous spaces? I feel there is none. It would be better to allow running Compressor with discontiguous spaces.

At the same time, there might be practical reasons that this might be not easy, or it requires too much efforts. It is also okay to skip those tests.

@no-defun-allowed
Collaborator Author

Is there any intrinsic reason why the compressor cannot run with discontiguous spaces? I feel there is none. It would be better to allow running Compressor with discontiguous spaces.

I feel that way too. The judgement was more about how closely we want to follow the Compressor paper - one simple approach to support discontiguous spaces would be to compress [sic] each chunk of a space, which would also give a simple way to parallelise (which is not the way in the paper).

The block offset vector is currently a Rust Vec<AtomicUsize>, and I do the address-to-block calculation assuming a contiguous heap; I should be using some other metadata mechanism for that anyway.
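The contiguous-heap assumption in that calculation amounts to something like the following (a hypothetical sketch, not the PR's code; the block size is an assumption):

```rust
// Hypothetical sketch of the address-to-block calculation being described:
// the block index is just the distance from the space start divided by the
// block size, which only works when the space is one contiguous range.
const BLOCK_SIZE: usize = 4096; // toy block size, an assumption

fn block_index(space_start: usize, addr: usize) -> usize {
    (addr - space_start) / BLOCK_SIZE
}

fn main() {
    // An address two blocks past the space start falls in block 2.
    assert_eq!(block_index(0x10000, 0x10000 + 2 * BLOCK_SIZE + 8), 2);
}
```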

@k-sareen
Collaborator

The block offset vector is currently a Rust Vec<AtomicUsize> which I do the address-to-block calculation assuming a contiguous heap; I should be using some other metadata mechanism for that anyway.

For what it's worth, ART also uses a uint32_t* for the offset vector calculation.

@no-defun-allowed
Collaborator Author

Now the block offset vector and the marks are in their own local metadata. I'm still undecided on if and how to support discontiguous spaces, since the forwarding algorithm in the Compressor paper seems very tied to having a contiguous space.

@qinsoon
Member

qinsoon commented Jul 18, 2025

The failures on 32 bits were caused by this: #1342. However, you reverted the side metadata commit for mock VM, so the bug won't appear in this PR anyway.

@qinsoon
Member

qinsoon commented Jul 18, 2025

Now the block offset vector and the marks are in their own local metadata. I'm still undecided on if and how to support discontiguous spaces, since the forwarding algorithm in the Compressor paper seems very tied to having a contiguous space.

If that's the case, just skip the tests that use discontiguous space for Compressor.

@no-defun-allowed
Collaborator Author

I added an MMTK_PLAN=discontiguous set to the CI script for tests which require discontiguous heaps. The script also removes the Compressor from the all set on i686, since i686 defaults to a discontiguous heap; some tests like test_vm_layout_default should still be tested on x86_64 but will panic on i686.

@qinsoon qinsoon added the PR-extended-testing (Run extended tests for the pull request) label Jul 22, 2025
@wks
Collaborator

wks commented Jul 22, 2025

I'm happy to approve this PR unless @wks you have some concerns?

Sorry about the delay. I'll review this PR now.

@k-sareen
Collaborator

Forgot to disable compressed oops in the tests. Fixed now hopefully.

@wks
Collaborator

wks commented Jul 22, 2025

A high-level question: will this algorithm work if some objects have stricter alignment requirements than others? For example, object A is 8-byte aligned and object B is 16-byte aligned, but A is 40 bytes in size, not a multiple of 16 bytes. They are laid out like this: A and B are bytes in objects A and B respectively, . is a gap byte between objects, and m is a mark bit.

0        8        16       24       32       40       48       56       64       
AAAAAAAA AAAAAAAA AAAAAAAA AAAAAAAA AAAAAAAA ........ BBBBBBBB BBBBBBBB
m                                   m                 m        m

The algorithm described in forwarding::Transducer will think there are 40 live bytes before object B. But since B requires 16-byte alignment, it cannot be moved to offset 40. It must be moved to offset 48 instead.
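The arithmetic of the problem can be shown with a toy helper (hypothetical, not mmtk-core's align_allocation_no_fill): rounding the naive forwarding offset up to the object's alignment shows where padding must be inserted.

```rust
// Round `offset` up to the next multiple of `align` (align must be a
// power of two). Hypothetical helper for illustration only.
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

fn main() {
    // 40 live bytes precede B, but B needs 16-byte alignment:
    // it must be forwarded to offset 48, not 40.
    assert_eq!(align_up(40, 16), 48);
}
```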

FYI, the existing mark compact space adds padding bits for the alignment when calculating forwarding addresses.

// MarkCompactSpace::calculating_forwarding_pointer
for obj in self
    .linear_scan_objects(from_start..from_end)
    .filter(Self::to_be_compacted)
{
    let copied_size =
        VM::VMObjectModel::get_size_when_copied(obj) + Self::HEADER_RESERVED_IN_BYTES;
    let align = VM::VMObjectModel::get_align_when_copied(obj);
    let offset = VM::VMObjectModel::get_align_offset_when_copied(obj);
    // move to_cursor to aligned start address
    to_cursor = align_allocation_no_fill::<VM>(to_cursor, align, offset);
    // move to next to-block if there is no sufficient memory in current region

If Compressor cannot handle this case and it is hard to fix, we can temporarily check VMBinding::MAX_ALIGNMENT when initializing the Compressor plan and panic if we cannot support that VM.

@no-defun-allowed
Collaborator Author

My code doesn't check alignment at all, and I'm not sure if there is a good way to implement variable alignment; we don't get much freedom in computing forwarding pointers from the mark bitmap. (The same applies to supporting discontiguous heaps with one Compressor setup/without regions.)

Is the size of an object always a multiple of its alignment? If not, I probably should take the alignment into account when determining where the last word of an object is.

@wks wks left a comment (Collaborator)

In MMTk, the starting address of an object is not necessarily the raw address of the ObjectReference. See the doc comment of ObjectReference in src/util/address.rs. Currently, JikesRVM may add a hash code in front of an object to implement address-based hashing. CRuby currently prepends a word to store the object size (which is a workaround and is supposed to be removed in the future).

But some VMs guarantee that the starting address is always the ObjectReference's raw address. They will set ObjectModel::UNIFIED_OBJECT_REFERENCE_ADDRESS to true, and it will simplify some algorithms inside mmtk-core. Currently the OpenJDK guarantees this.

So you may check if UNIFIED_OBJECT_REFERENCE_ADDRESS is true and reject unsupported bindings when using Compressor.

You can also try to fix the code to support VMs where ObjectReference is not necessarily the starting address. That may involve setting the mark bits at the starting address instead of at ObjectReference, and scanning the VO bits in addition to the mark bits to find the ObjectReference when copying.


pub fn test_and_mark(object: ObjectReference) -> bool {
    let old = forwarding::MARK_SPEC.fetch_or_atomic(
        object.to_raw_address(),

Here you set the mark bit on the raw address. But the raw address may not be the starting address of an object (unless UNIFIED_OBJECT_REFERENCE_ADDRESS is true).

let mut in_object = false;
MARK_SPEC.scan_non_zero_values::<u8>(start, end, &mut |addr: Address| {
    if !in_object {
        let object = ObjectReference::from_raw_address(addr).unwrap();

Note that the starting address may not be the raw address of ObjectReference (unless UNIFIED_OBJECT_REFERENCE_ADDRESS is true). If the mark bits are set on the starting and the ending words of an object, we will not be able to obtain ObjectReference from the starting address alone. But if we also use the VO bits, we can scan the VO bits to identify the ObjectReference.

Note that the existing MarkCompactSpace uses the VO bits as a local metadata even when we don't enable global VO bits.

Also note that we have ObjectModel::get_reference_when_copied_to(from: ObjectReference, to: Address) -> ObjectReference. It doesn't work here: it needs a from reference which points to the object before copying in order to get the type information and compute the distance between the starting address and the ObjectReference.

@wks
Collaborator

wks commented Jul 22, 2025

You can also try to fix the code to support VMs where ObjectReference is not necessarily the starting address. That may involve setting the mark bits at the starting address instead of at ObjectReference, and scanning the VO bits in addition to the mark bits to find the ObjectReference when copying.

Well, it may be hard to do the testing. JikesRVM is legacy software and needs special build environments to test. Neither CRuby nor Julia can use mark-compact because they need object pinning, which mark-compact cannot support. Maybe we just support VMs where UNIFIED_OBJECT_REFERENCE_ADDRESS == true, and leave JikesRVM to another day.

@no-defun-allowed
Collaborator Author

"If the mark bits are set on the starting and the ending words of an object, we will not be able to obtain ObjectReference from the starting address alone" sounds conclusive enough. I'll assert UNIFIED_OBJECT_REFERENCE_ADDRESS for now.

@k-sareen
Collaborator

JikesRVM won't run currently anyway since this implementation of the Compressor requires contiguous spaces. When we have discontiguous space support, then maybe we can look at JikesRVM support.

@no-defun-allowed
Collaborator Author

I'll assert UNIFIED_OBJECT_REFERENCE_ADDRESS for now.

But then I can't run mock VM tests with the Compressor, and adding const UNIFIED_OBJECT_REFERENCE_ADDRESS: bool = true; to MockVM produces this test failure:

---- vm::tests::mock_tests::mock_test_internal_ptr_normal_object::interior_pointer_in_normal_object stdout ----
start = 0x20000000000, end = 0x20000000010, obj = 0x20000000008
thread 'vm::tests::mock_tests::mock_test_internal_ptr_normal_object::interior_pointer_in_normal_object' panicked at src/util/address.rs:627:9:
The binding claims unified object reference address, but for object reference 0x20000000008, ref_to_object_start() returns 0x20000000000

@wks
Collaborator

wks commented Jul 22, 2025

I'll assert UNIFIED_OBJECT_REFERENCE_ADDRESS for now.

But then I can't run mock VM tests with the Compressor, and adding const UNIFIED_OBJECT_REFERENCE_ADDRESS: bool = true; to MockVM produces this test failure:

---- vm::tests::mock_tests::mock_test_internal_ptr_normal_object::interior_pointer_in_normal_object stdout ----
start = 0x20000000000, end = 0x20000000010, obj = 0x20000000008
thread 'vm::tests::mock_tests::mock_test_internal_ptr_normal_object::interior_pointer_in_normal_object' panicked at src/util/address.rs:627:9:
The binding claims unified object reference address, but for object reference 0x20000000008, ref_to_object_start() returns 0x20000000000

Oops. There are test cases for situations where the starting address is not equal to the ObjectReference. (IIRC I added some similar tests too, but they seem to have gone.)

Maybe we can use Cargo features to conditionally enable compressor-related tests and set UNIFIED_OBJECT_REFERENCE_ADDRESS to true, and conditionally disable those tests that conflict with those options.

@k-sareen
Collaborator

Maybe this can be a warn!() if UNIFIED_OBJECT_REFERENCE_ADDRESS is not true. That might let us side-step this issue in the short term.

@qinsoon
Member

qinsoon commented Jul 22, 2025

So currently the compressor has the following requirements/limitations:

  • Requires contiguous space
  • Requires object references to be the same as the object start addresses
  • Objects should not grow size during copying.
  • Ignores object alignment (assumes objects are aligned to word-size?)

I feel it is worth discussing these points in a meeting, to see which are intrinsic to the algorithm and which are introduced by the implementation.

It is not necessary to address the above issues before this PR can be merged, but it is good for us to understand the issues.

@qinsoon
Member

qinsoon commented Jul 23, 2025

The following is my understanding about the limitations:


EDIT: I realized that an annoying thing is that during forwarding, you can no longer access object metadata. Calculating the forwarding address can only be done with pre-computed side metadata. That means if we access any object metadata during the pre-computation (such as copied size, alignment, etc.), that metadata needs to be recorded somehow in the side metadata so that during forwarding we can do the same calculation again using side metadata alone.


* Requires contiguous space

Supporting discontiguous space seems doable, but it complicates things, especially the forwarding part.

First, the encoding should be fine with a discontiguous space, as the offset vector itself does not include any address. While computing the offset vector, the current code uses the next block's start address to calculate the remaining live bytes in the previous block -- this assumes contiguous spaces. We could instead visit the block end address once to calculate the remaining live bytes. This should be easy to fix.

self.live + (current_position - self.last_bit_visited) + 1

Then, forwarding. The forwarding address is computed as below:

self.first_address + state.live

To support discontiguous space, we cannot use the start of the space, and we need some extra information (e.g. a table that maps block index to block start) to know the current block start, then apply the offset internal to a block to it -- state.live % block_size.
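That table lookup could be sketched like this (hypothetical names and a toy block size, not an actual implementation):

```rust
// Hypothetical sketch of the suggestion above: with a discontiguous space,
// map the accumulated live bytes to a to-block via a block-start table,
// then apply the offset internal to that block.
const BLOCK_SIZE: usize = 4096; // toy block size, an assumption

fn forward_discontiguous(block_starts: &[usize], live: usize) -> usize {
    let block = live / BLOCK_SIZE; // index of the to-block the copy lands in
    let offset = live % BLOCK_SIZE; // offset internal to that block
    block_starts[block] + offset
}

fn main() {
    // Two non-adjacent blocks: live bytes past the first block land in the
    // second block, at its own start address.
    let block_starts = [0x10000, 0x30000];
    assert_eq!(forward_discontiguous(&block_starts, BLOCK_SIZE + 8), 0x30008);
}
```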


* Requires object references to be the same as the object start addresses

This should be purely an implementation limitation. In theory, the implementation could always use object.to_object_start() and never use object.to_raw_address(). And when it needs to return an object reference from a forwarded address, use ObjectModel::get_reference_when_copied_to.


* Ignores object alignment (assumes objects are aligned to word-size?)

It is also possible to properly support object alignment.

When we encounter an object while computing the offset vector, we can compute the required alignment in the same way as mark compact does. The diff between the aligned to-cursor and the previous to-cursor is the alignment padding that can be added to the live bytes prior to the object.

let align = VM::VMObjectModel::get_align_when_copied(obj);
let offset = VM::VMObjectModel::get_align_offset_when_copied(obj);
// move to_cursor to aligned start address
to_cursor = align_allocation_no_fill::<VM>(to_cursor, align, offset);


* Objects should not grow size during copying.

If we calculate the object size through the mark bits, it will be the original size before copying. But we can query ObjectModel::get_size_when_copied and apply the diff to the size when calculating live bytes for the offset vector.

@no-defun-allowed
Collaborator Author

When we encounter an object while computing the offset vector

apply the diff to the size when calculating live bytes for the offset vector

The issue is that we don't only use the offset vector -- we also scan the mark bitmap between the start of the block containing an object and that object. So we need to be able to encode the space taken in the mark bitmap. I think this is possible for alignment/padding, as there should be enough space in the mark bitmap to adjust the first/last word marks, but there might not be for objects which grow when copied.

@no-defun-allowed
Collaborator Author

no-defun-allowed commented Jul 23, 2025

To support discontiguous space, we cannot use the start of the space, and we need some extra information (e.g. a table that maps block index to block start) to know the current block start, then apply the offset internal to a block to it -- state.live % block_size.

The forwarding loop doesn't account for any chunk boundaries either. I'm undecided on whether preventing objects from crossing boundaries can be done with the mark bitmap or not.

Presently I plan to side-step the issues with the forwarding calculation by compacting regions of the heap individually, with regions smaller than or equal in size to the contiguous chunks of a discontiguous heap.

@k-sareen
Collaborator

@qinsoon Should we skip the Compressor for MockVM tests for the time-being then?

@qinsoon
Member

qinsoon commented Jul 24, 2025

@qinsoon Should we skip the Compressor for MockVM tests for the time-being then?

Sounds good to me. Just create an issue about how we can allow MockVM to run tests for the Compressor (I understand MockVM tests have their own limitations), or skip creating the issue if there is a plan to fix it in the near future.

@k-sareen
Collaborator

@no-defun-allowed Can you just temporarily set ALL_PLANS to ALL_DISCONTIGUOUS_PLANS in the test script and add a comment there? That should work I think. Then we can merge this PR.

@no-defun-allowed
Collaborator Author

@no-defun-allowed Can you just temporarily set ALL_PLANS to ALL_DISCONTIGUOUS_PLANS in the test script and add a comment there? That should work I think. Then we can merge this PR.

Done.

@qinsoon qinsoon left a comment (Member)

LGTM

@k-sareen k-sareen added this pull request to the merge queue Jul 24, 2025
Merged via the queue into mmtk:master with commit d93262b Jul 24, 2025
31 of 33 checks passed